Title of dataverse: Historical covidestim runs 

Contact information: covidestim research team
For questions about the data or the project, contact:
Fayette Klaassen - postdoc fklaassen@hsph.harvard.edu
Nicolas Menzies - PI nmenzies@hsph.harvard.edu
Ted Cohen - PI theodore.cohen@yale.edu
Joshua Salomon - PI salomon1@stanford.edu


Structure of dataverse:

This dataverse consists of 5 datasets that each contain a partition of historical model runs.

- 2020 daily:	The output from the daily covidestim runs over the course of 2020
- 2021-1 daily: 	The output from the daily covidestim runs over the first half of 2021 (Jan - Jun)
- 2021-2 daily:	The output from the daily covidestim runs over the second half of 2021 (Jul - Dec)
- 2022 daily:	The output from the daily covidestim runs in 2022 (Jan - Feb)
- 2022-2024 weekly: The output from the weekly covidestim runs over the course of 2022-2024 (not continuous)	

This README further documents the GitHub repositories that describe the workflow used to produce the output in the 5 Estimate datasets.


File structure:

The folders and files in the 5 Estimate datasets follow a similar structure.
Folders are labeled by the date (yyyy-mm-dd) of the model run, and may or may not contain each of the following:
- County-level results at the first level: <estimates.csv.gz; summary.pack.gz>
- State-level results at the second level: state/<estimates.csv.gz; summary.pack.gz>

The <estimates.csv.gz> files contain the raw, line-listed results as gzip-compressed CSV, and <summary.pack.gz> is a web package for AWS hosting.
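
As a rough illustration of the layout described above, the following Python sketch lists the date-labeled folders in a locally downloaded dataset and reads whichever <estimates.csv.gz> files are present. It is not part of the covidestim code base: the function names list_runs and read_estimates are hypothetical, the download location is an assumption, and no column names are assumed (rows are returned as dictionaries keyed by whatever header the file has).

```python
import csv
import gzip
import re
from pathlib import Path

# Folders are labeled yyyy-mm-dd; anything else in the dataset root is skipped.
DATE_DIR = re.compile(r"^\d{4}-\d{2}-\d{2}$")


def list_runs(root):
    """Map each date-labeled folder to the estimates files it contains.

    A folder may hold county-level results, state-level results, both,
    or neither, so each entry only lists the files that exist.
    """
    runs = {}
    for folder in sorted(Path(root).iterdir()):
        if not (folder.is_dir() and DATE_DIR.match(folder.name)):
            continue
        found = {}
        county = folder / "estimates.csv.gz"           # first level
        state = folder / "state" / "estimates.csv.gz"  # second level
        if county.is_file():
            found["county"] = county
        if state.is_file():
            found["state"] = state
        runs[folder.name] = found
    return runs


def read_estimates(path):
    """Read one gzip-compressed CSV into a list of row dictionaries."""
    with gzip.open(path, mode="rt", newline="") as handle:
        return list(csv.DictReader(handle))
```

For example, with a dataset unpacked into a (hypothetical) local folder named covidestim-2021-1, list_runs("covidestim-2021-1") returns one entry per run date, and read_estimates(runs[date]["state"]) loads the state-level results for a date that has them. Checking for each file separately reflects the fact that a folder may lack either level.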


Workflow/code to produce the daily/weekly estimates:

The workflow to produce the daily/weekly estimates is spread across a few GitHub repositories, all hosted under the github.com/covidestim organization.

- dailyFlow:		a collection of scripts describing the Nextflow routine that coordinates the steps from the other repositories
- covidestim-sources: 	the repository that scrapes the relevant online data sources, then assembles and cleans them for analysis
- covidestim:		the repository that contains the R package and Stan code to run the analysis and render the final estimates
- webworker:		a repository that serializes the analyses and packages the serialized outputs into zipped files